1 Introduction

Team A consists of the following members: Amal Alqahtani, Peng Jiaxiang, Elahi Naureen, and Mu Xinya. Our work is available on GitHub.

Coronavirus disease 2019 (COVID-19) has spread rapidly around the world, and many risk factors have been hypothesized to affect case and death rates. We felt that the relevant questions to discuss would be: Which regions have the highest number of deaths? What can we say about patient demographics? Is race a significant risk factor for increased COVID-19 incidence in the United States? Have there been any general trends among health conditions? These questions are all suited to exploratory data analysis (EDA), and with them in mind, we looked for COVID-19 data that would be readily available for us to analyze. Eventually, our question morphed into the following: What are the factors (i.e. patient demographics, social determinants of health, environmental variables, underlying health conditions, country of origin) affecting COVID-19 case numbers and death rates across different geographical locations in the US?

We found a dataset called Covid-19-Dataset on GitHub: https://github.com/johndurbin93/Covid-19-Dataset. It includes confirmed COVID-19 case and death counts through April 14, 2020, obtained for each U.S. county from the Center for Systems Science and Engineering (CSSE) Coronavirus Resource Center at Johns Hopkins University. County race demographics were obtained from the County Health Rankings and Roadmaps Program database. Daily county temperature data were obtained from the National Oceanic and Atmospheric Administration. The data were compiled by a group of researchers.

The report is organized as follows:

  1. Summary of Dataset
  2. Description of Data
  3. What are the patient demographics? (demographics Data)
  4. Where do …… come from? Or: Which regions have the highest number of deaths? (country of origin Data)
  5. What are the social determinants of health? (social determinants of health Data)
  6. Environmental variables Data Goes Here?
  7. Are there any common underlying health conditions? (health conditions Data)
  8. Have there been any ……. trends among patients over time? (Trends Over Time)
  9. Are patients predominantly of a certain race? (race Data)
  10. Discussion And Conclusion

2 Summary of Dataset

The data looks like the following:

tibble [3,144 × 83] (S3: tbl_df/tbl/data.frame)
 $ Province                                                                                               : chr [1:3144] "Autauga" "Baldwin" "Barbour" "Bibb" ...
 $ State                                                                                                  : chr [1:3144] "Alabama" "Alabama" "Alabama" "Alabama" ...
 $ State Code                                                                                             : chr [1:3144] "AL" "AL" "AL" "AL" ...
 $ Latitude                                                                                               : num [1:3144] 32.5 30.7 31.9 33 34 ...
 $ Longitude                                                                                              : num [1:3144] -86.6 -87.7 -85.4 -87.1 -86.6 ...
 $ Tests                                                                                                  : num [1:3144] 33835 33835 33835 33835 33835 ...
 $ Days Since 1st Case                                                                                    : num [1:3144] 22 31 12 16 21 20 21 28 27 21 ...
 $ Total cases (4/14/2020)                                                                                : num [1:3144] 23 87 11 17 16 8 8 62 216 9 ...
 $ Deaths (4/14/2020)                                                                                     : num [1:3144] 1 2 0 0 0 0 0 0 10 0 ...
 $ Population (for demographic %'s)                                                                       : num [1:3144] 55601 218022 24881 22400 57840 ...
 $ % less than 18 years of age                                                                            : num [1:3144] 23.7 21.6 20.9 20.5 23.2 ...
 $ % 65 and over                                                                                          : num [1:3144] 15.6 20.4 19.4 16.5 18.2 ...
 $ % Black                                                                                                : num [1:3144] 19.34 8.78 48.03 21.12 1.46 ...
 $ % American Indian & Alaska Native                                                                      : num [1:3144] 0.48 0.772 0.659 0.438 0.654 ...
 $ % Asian                                                                                                : num [1:3144] 1.225 1.15 0.454 0.237 0.32 ...
 $ % Native Hawaiian/Other Pacific Islander                                                               : num [1:3144] 0.112 0.067 0.185 0.116 0.121 ...
 $ % Hispanic                                                                                             : num [1:3144] 2.97 4.65 4.28 2.62 9.57 ...
 $ % Non-Hispanic White                                                                                   : num [1:3144] 74.3 83.1 45.6 74.6 86.9 ...
 $ % Not Proficient in English                                                                            : num [1:3144] 0.82 0.544 1.632 0.268 1.725 ...
 $ % Female                                                                                               : num [1:3144] 51.4 51.5 47.2 46.8 50.7 ...
 $ % Rural                                                                                                : chr [1:3144] "42.002162300000002" "42.279099100000003" "67.789634699999993" "68.352607500000005" ...
 $ Population Density per Square mile of Land (2010)                                                      : num [1:3144] 91.8 114.6 31 36.8 88.9 ...
 $ Housing Density Per Square Mile of Land                                                                : num [1:3144] 37.2 65.5 13.4 14.4 37 7.2 12.8 88 28.5 29.4 ...
 $ Avg Daily March 2011 Sunlight (KJ/m²) Missing HI and AK                                                : num [1:3144] 18450 18855 18611 18235 17239 ...
 $ GDP 2018                                                                                               : num [1:3144] 1483414 5774289 787425 364197 849114 ...
 $ GDP/capita                                                                                             : num [1:3144] NA NA NA NA NA NA NA NA NA NA ...
 $ Percentage Living in Poverty, All Ages, 2016                                                           : chr [1:3144] "13.5" "11.7" "29.9" "20.100000000000001" ...
 $ Air Quality, Annual Average Ambient Concentrations of PM2.5, 2014                                      : chr [1:3144] "11.7" "10.3" "11.5" "11.2" ...
 $ Primary Care Physicians Ratio                                                                          : chr [1:3144] "92.500694444444449" "57.167361111111113" "131.62569444444446" "85.875694444444449" ...
 $ Dentist Ratio                                                                                          : chr [1:3144] "128.70902777777778" "84.125694444444449" "115.20902777777779" "186.66736111111109" ...
 $ Mental Health Provider Ratio                                                                           : chr [1:3144] "178.20902777777778" "43.250694444444441" "12441:1" "186.66736111111109" ...
 $ High School Graduation Rate                                                                            : chr [1:3144] "90" "86.361576799999995" "81.410256399999994" "83.763837600000002" ...
 $ % Some College                                                                                         : num [1:3144] 62 67.4 34.9 44.1 53.4 ...
 $ % Unemployed                                                                                           : chr [1:3144] "3.6290788599999999" "3.6153821599999998" "5.1713842100000003" "3.9718277299999998" ...
 $ % Children in Poverty                                                                                  : chr [1:3144] "19.3" "13.9" "43.9" "27.8" ...
 $ Income Inequality Ratio (80th%/20th%)                                                                  : num [1:3144] 5.23 4.42 5.68 4.37 4.43 ...
 $ % Single-Parent Households                                                                             : chr [1:3144] "26.2426791" "24.139601500000001" "56.603425999999999" "28.689236099999999" ...
 $ Social Association Rate                                                                                : num [1:3144] 12.07 10.21 7.52 8.38 8.45 ...
 $ Violent Crime Rate                                                                                     : chr [1:3144] "272.28222" "203.66039599999999" "414.27786099999997" "89.349125999999998" ...
 $ Air pollution: Average Daily PM2.5                                                                     : chr [1:3144] "11.7" "10.3" "11.5" "11.2" ...
 $ Presence of Drinking Water Violation                                                                   : chr [1:3144] "No" "No" "No" "No" ...
 $ % Severe Housing Problems                                                                              : num [1:3144] 14.7 13.6 14.6 10.5 10.5 ...
 $ Housing: Severe Cost Burden                                                                            : num [1:3144] 12.83 12.28 13.45 9.98 7.85 ...
 $ Housing: Overcrowding                                                                                  : num [1:3144] 1.202 1.271 1.689 0.255 1.891 ...
 $ Housing: Inadequate Facilities                                                                         : num [1:3144] 1.346 0.479 0.603 0.709 1.091 ...
 $ % Drive Alone to Work                                                                                  : num [1:3144] 86.5 84.3 83.4 84.9 86.2 ...
 $ % Long Commute - Drives Alone                                                                          : num [1:3144] 39.6 41.7 32.2 49.8 59.4 46 32.3 31.2 31.8 46.1 ...
 $ Sleep <7 Hours_Percent                                                                                 : num [1:3144] 35.9 33.3 38.6 38.1 35.9 ...
 $ Sleep <7 Hours_CI_Low                                                                                  : num [1:3144] 35 32.5 37.7 37.1 34.8 ...
 $ Sleep <7 Hours_CI_High                                                                                 : num [1:3144] 36.8 34.1 39.5 39.2 37.1 ...
 $ Diabetes Total Percentage                                                                              : num [1:3144] 9.9 8.5 15.7 13.3 14.9 22.4 16.9 15.6 17.5 12.2 ...
 $ Diabetes Male Percentage                                                                               : num [1:3144] 10.1 8.8 16.1 14.5 17 22.1 17.7 16.7 18.3 13.7 ...
 $ Diabetes Female Percentage                                                                             : num [1:3144] 9.8 8.4 15.4 12.2 12.9 22.6 16.4 14.7 16.7 10.7 ...
 $ Coronary Heart Disease Death Rate per 100,000, All Ages, All Races/Ethnicities, Both Genders, 2014-2016: chr [1:3144] "112.7" "108" "63" "63.8" ...
 $ Hypertension Death Rate per 100,000 (any mention), 35+, All Races/Ethnicities, Both Genders, 2014-2016 : chr [1:3144] "241.4" "171.5" "192.5" "142.19999999999999" ...
 $ Obesity, Age-Adjusted Percentage, 20+. 2015                                                            : chr [1:3144] "37.6" "31.3" "44.7" "37.9" ...
 $ % Fair or Poor Health                                                                                  : num [1:3144] 20.9 17.5 29.6 19.4 21.7 ...
 $ Average Number of Physically Unhealthy Days                                                            : num [1:3144] 4.74 4.22 5.43 4.59 4.86 ...
 $ Average Number of Mentally Unhealthy Days                                                              : num [1:3144] 4.65 4.3 5.19 4.55 4.89 ...
 $ % Low Birthweight                                                                                      : chr [1:3144] "8.6195286200000005" "8.3450031800000009" "11.474558699999999" "10.30871" ...
 $ % Smokers (adults)                                                                                     : num [1:3144] 18.1 17.5 22 19.1 19.2 ...
 $ % Adults with Obesity                                                                                  : num [1:3144] 33.3 31 41.7 37.6 33.8 37.2 43.3 38.5 40.1 35 ...
 $ Food Environment Index                                                                                 : chr [1:3144] "7.2" "8" "5.6" "7.8" ...
 $ % Physically Inactive                                                                                  : num [1:3144] 34.7 26.5 23.5 33.5 30.3 24.6 39.5 31.7 30.1 31.3 ...
 $ % With Access to Exercise Opportunities                                                                : chr [1:3144] "69.130124100000003" "73.713549" "53.166769899999998" "16.251363699999999" ...
 $ % Excessive Drinking                                                                                   : num [1:3144] 15 18 12.8 15.6 14.2 ...
 $ % Uninsured                                                                                            : chr [1:3144] "8.7216859499999995" "11.3334045" "12.2427925" "10.206252599999999" ...
 $ Preventable Hospitalization Rate (Preventable hospital stays)                                          : chr [1:3144] "7108" "4041" "6209" "5961" ...
 $ % With Annual Mammogram                                                                                : chr [1:3144] "41" "43" "45" "40" ...
 $ % Flu Vaccinated                                                                                       : chr [1:3144] "41" "44" "37" "38" ...
 $ Chronic Respiratory Disease: mortality rate per 100K (2014)                                            : num [1:3144] 81.8 54.3 69.8 84.5 87 ...
 $ Liver Disease: crude mortality rate per 100K (1999-2018)                                               : chr [1:3144] "15.805600800000001" "19.802747700000001" "12.9148204" "13.557851400000001" ...
 $ Liver Disease: % of Total Deaths (1999-2018)                                                           : chr [1:3144] "1.9061999999999999E-4" "8.1862999999999999E-4" "8.2999999999999998E-5" "7.0199999999999999E-5" ...
 $ Liver Disease: crude mortality rate per 100K (2018)                                                    : chr [1:3144] "Unreliable" "27.520158500000001" "NA" "NA" ...
 $ Liver Disease: % of Total Deaths (2018)                                                                : chr [1:3144] "1.7882999999999999E-4" "1.073E-3" "NA" "NA" ...
 $ Avg Temp Peak Growth-10 Rate                                                                           : num [1:3144] 20.5 22.1 18.5 18.4 18.2 ...
 $ Avg Temp 10 Before First-Current                                                                       : num [1:3144] 19.9 20.9 18.1 17.7 17.8 ...
 $ Avg Temp First-Current                                                                                 : num [1:3144] 19.3 21.8 17 16 17.5 ...
 $ First Case                                                                                             : POSIXct[1:3144], format: "2020-03-24" "2020-03-15" ...
 $ Stay At Home                                                                                           : POSIXct[1:3144], format: "2020-04-04" "2020-04-04" ...
 $ No Cases                                                                                               : num [1:3144] 0 0 0 0 0 0 0 0 0 0 ...
 $ No Stay At Home Order                                                                                  : num [1:3144] 0 0 0 0 0 0 0 0 0 0 ...
 $ Stay At Home Order After First Case                                                                    : num [1:3144] 1 1 1 1 1 1 1 1 1 1 ...

The COVID-19 data has 83 columns and 3,144 rows/entries, for a total of 260,952 individual data points. The variables include the following:

  1. Province
  2. State
  3. Tests
  4. Total cases
  5. Deaths
  6. Population (for demographic %’s)
  7. % less than 18 years of age
  8. % 65 and over
  9. % Black
  10. % American Indian & Alaska Native
  11. % Asian
  12. % Native Hawaiian/Other Pacific Islander
  13. % Hispanic
  14. % Non-Hispanic White
  15. % Not Proficient in English
  16. % Female
  17. % Rural
  18. Sleep <7 Hours_Percent
  19. Sleep <7 Hours_CI_Low
  20. Sleep <7 Hours_CI_High
  21. Diabetes Total Percentage
  22. Diabetes Male Percentage
  23. Diabetes Female Percentage
  24. Coronary Heart Disease Death Rate per 100,000, All Ages , All Races/Ethnicities, Both Genders, 2014-2016
  25. Hypertension Death Rate per 100,000 (any mention), 35+, All Races/Ethnicities, Both Genders, 2014-2016
  26. Obesity, Age-Adjusted Percentage, 20+. 2015
  27. % Fair or Poor Health
  28. Average Number of Physically Unhealthy Days
  29. Average Number of Mentally Unhealthy Days
  30. % Low Birthweight
  31. % Smokers (adults)
  32. % Adults with Obesity
  33. Food Environment Index
  34. % Physically Inactive
  35. % With Access to Exercise Opportunities
  36. % Excessive Drinking
  37. % Uninsured
  38. Preventable Hospitalization Rate (Preventable hospital stays)
  39. % With Annual Mammogram
  40. % Flu Vaccinated
  41. Chronic Respiratory Disease: mortality rate per 100K (2014)
  42. Liver Disease: crude mortality rate per 100K (1999-2018)
  43. Liver Disease: % of Total Deaths (1999-2018)
  44. Liver Disease: crude mortality rate per 100K (2018)
  45. Liver Disease: % of Total Deaths (2018)
  46. Avg Temp Peak Growth-10 Rate
  47. Avg Temp 10 Before First-Current
  48. Avg Temp First-Current

  • Stay At Home Order After First Case
  • Percentage Living in Poverty
  • Social Association Rate

To prepare our data for EDA, we cleaned the dataset and removed all NAs.
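Many columns in the structure listing are stored as character strings even though they hold numbers, and a few entries such as "12441:1", "Unreliable", and "NA" are not numbers at all. The following Python sketch illustrates one way such values could be coerced before missing entries are dropped. It is not the cleaning code used for this report; the helper name `to_number` is hypothetical, and the sample strings are copied from the listing.

```python
import math
from typing import Optional


def to_number(raw: str) -> Optional[float]:
    """Return raw as a finite float, or None when it is not a plain number."""
    try:
        value = float(raw)
    except ValueError:
        return None
    # Treat NaN and infinities as missing as well.
    return value if math.isfinite(value) else None


# Sample strings copied from the structure listing of the raw data.
samples = ["42.002162300000002", "1.9061999999999999E-4", "12441:1", "Unreliable", "NA"]
parsed = [to_number(s) for s in samples]

# Keep only the entries that parsed cleanly, mirroring the removal of NAs.
clean = [x for x in parsed if x is not None]

print(parsed)
print(len(clean))
```

Ratio-style entries such as "12441:1" are simply treated as missing here; whether they should instead be converted is a separate decision that this sketch does not make.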

    tibble [3,144 × 15] (S3: tbl_df/tbl/data.frame)
     $ Province                                : chr [1:3144] "New York City" "Nassau" "Suffolk" "Westchester" ...
     $ State                                   : chr [1:3144] "New York" "New York" "New York" "New York" ...
     $ total_cases                             : num [1:3144] 110465 25250 22691 20191 16323 ...
     $ deaths                                  : num [1:3144] 7905 1001 608 596 577 ...
     $ Population (for demographic %'s)        : chr [1:3144] "8623000" "1358343" "1481093" "967612" ...
     $ % less than 18 years of age             : chr [1:3144] "20.9" "21.459675499999999" "21.134324500000002" "21.900513799999999" ...
     $ % 65 and over                           : chr [1:3144] "14.1" "17.763039200000001" "16.862951899999999" "17.053116299999999" ...
     $ % Black                                 : chr [1:3144] "24.3" "11.6331442" "7.3924459799999998" "13.8042935" ...
     $ % American Indian & Alaska Native       : chr [1:3144] "0.4" "0.54294092000000005" "0.61373593999999998" "0.95647842000000005" ...
     $ % Asian                                 : chr [1:3144] "13.9" "10.4504532" "4.1896086199999996" "6.43553408" ...
     $ % Native Hawaiian/Other Pacific Islander: chr [1:3144] "0.1" "0.1" "9.5899999999999999E-2" "0.13228443000000001" ...
     $ % Hispanic                              : chr [1:3144] "29.1" "17.231362000000001" "19.775260599999999" "25.140345499999999" ...
     $ % Non-Hispanic White                    : chr [1:3144] "32.1" "59.333835399999998" "67.190378999999993" "53.088118000000001" ...
     $ % Not Proficient in English             : chr [1:3144] "9" "5.3660427200000003" "4.00639637" "6.3180527499999997" ...
     $ % Female                                : chr [1:3144] "52.3" "51.306334300000003" "50.771693599999999" "51.559095999999997" ...

    3 Chapter 3: Independent Variables EDA

    3.1 United States COVID-19 Cases and Deaths by Provinces (Cities)

    3.1.1 What are the top 15 Provinces based on the number of cases?

    The following bar chart shows the top 15 cities by number of Covid19 cases.

    3.1.2 What are the top 15 Provinces based on the number of deaths?

    The following bar chart shows the top 15 cities by number of deaths.

    3.1.3 What is the average number of cases for each State?

                      State total_cases
    1               Alabama       59.03
    2                Alaska        9.83
    3               Arizona      258.60
    4              Arkansas       19.44
    5            California      437.43
    6              Colorado      122.41
    7           Connecticut     1682.00
    8              Delaware      638.33
    9  District of Columbia     2058.00
    10              Florida      323.07
    11              Georgia       85.74
    12               Hawaii      101.60
    13                Idaho       33.32
    14             Illinois      227.48
    15              Indiana       94.12
    16                 Iowa       19.20
    17               Kansas       13.84
    18             Kentucky       17.32
    19            Louisiana      335.34
    20                Maine       45.88
    21             Maryland      394.75
    22        Massachusetts     1843.87
    23             Michigan      316.96
    24            Minnesota       19.10
    25          Mississippi       37.68
    26             Missouri       40.98
    27              Montana        7.18
    28             Nebraska        9.48
    29               Nevada      184.35
    30        New Hampshire      103.50
    31           New Jersey     3196.29
    32           New Mexico       40.82
    33             New York     3274.52
    34       North Carolina       51.20
    35         North Dakota        6.45
    36                 Ohio       82.81
    37             Oklahoma       28.51
     [ reached 'max' / getOption("max.print") -- omitted 14 rows ]
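The per-state averages above are group means over the rows belonging to each state. A minimal Python sketch of that calculation follows, assuming the table was produced by averaging total cases within each state. It uses only the first four rows per state that are visible in the structure listings, so its results will not match the full table.

```python
from collections import defaultdict
from statistics import mean

# (province, state, total cases): the first four rows shown for each state
# in the structure listings, not the full dataset.
rows = [
    ("Autauga", "Alabama", 23),
    ("Baldwin", "Alabama", 87),
    ("Barbour", "Alabama", 11),
    ("Bibb", "Alabama", 17),
    ("New York City", "New York", 110465),
    ("Nassau", "New York", 25250),
    ("Suffolk", "New York", 22691),
    ("Westchester", "New York", 20191),
]

# Collect the case counts of every row under its state.
cases_by_state = defaultdict(list)
for _province, state, total_cases in rows:
    cases_by_state[state].append(total_cases)

# Average within each state, rounded to two decimals like the table.
state_means = {state: round(mean(values), 2) for state, values in cases_by_state.items()}

for state in sorted(state_means):
    print(f"{state:<10}{state_means[state]:>12.2f}")
```

The same grouping applies to the average number of deaths by swapping in the deaths column.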

    3.1.4 What is the average number of deaths for each State?

                      State  deaths
    1               Alabama   1.701
    2                Alaska   0.172
    3               Arizona   7.133
    4              Arkansas   0.427
    5            California  13.328
    6              Colorado   5.109
    7           Connecticut  83.375
    8              Delaware  14.333
    9  District of Columbia  67.000
    10              Florida   7.836
    11              Georgia   3.270
    12               Hawaii   1.800
    13                Idaho   0.750
    14             Illinois   8.510
    15              Indiana   4.207
    16                 Iowa   0.444
    17               Kansas   0.657
    18             Kentucky   0.900
    19            Louisiana  15.922
    20                Maine   1.250
    21             Maryland  12.667
    22        Massachusetts  49.600
    23             Michigan  21.133
    24            Minnesota   0.920
    25          Mississippi   1.366
    26             Missouri   1.284
    27              Montana   0.143
    28             Nebraska   0.161
    29               Nevada   7.059
    30        New Hampshire   0.300
    31           New Jersey 133.476
    32           New Mexico   0.939
    33             New York 174.871
    34       North Carolina   1.130
    35         North Dakota   0.151
    36                 Ohio   3.705
    37             Oklahoma   1.403
     [ reached 'max' / getOption("max.print") -- omitted 14 rows ]

    3.1.5 Which cities had the highest percentage of people in poor health?

    3.2 Patient Demographics

    3.2.1 What are the patient demographics?

    Table: Statistics summary.
               TC Population young  old black AIAN Asian   NH Hispanic  NHW Female Poverty Social
    Min         0         88   0.0  4.8   0.0  0.0   0.0  0.0      0.6  2.7   26.8     3.4    0.0
    Q1          2      11034  20.1 16.3   0.7  0.4   0.5  0.0      2.4 64.7   49.4    11.4    8.2
    Median      9      25758  22.1 19.0   2.2  0.6   0.7  0.1      4.4 83.5   50.3    14.8   11.1
    Mean      191     105871  22.1 19.3   8.8  2.4   1.5  0.1      9.6 76.2   49.9    15.9   11.6
    Q3         39      67013  23.8 21.8   9.6  1.3   1.4  0.1      9.9 92.3   51.0    19.0   14.4
    Max    110465   10105518  42.0 57.6  85.4 92.5  43.4 48.9     96.4 97.9   56.9    48.6   52.3

    In the table, TC is total cases, young and old are the percentages of residents under 18 and 65 or over, AIAN is American Indian & Alaska Native, NH is Native Hawaiian/Other Pacific Islander, NHW is Non-Hispanic White, Poverty is the percentage living in poverty, and Social is the Social Association Rate. Averaged across counties, 22.1% of residents are under the age of 18 and 19.3% are 65 or over. The largest racial group is Non-Hispanic White, with an average share of 76.2%. On average, women make up 49.9% of residents, 15.9% of residents live in poverty, and the Social Association Rate is 11.6. We divide the data into four levels according to total cases.
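The exact rule for the four levels is not stated. One plausible reading is that the cut points are the quartiles of total cases reported in the summary table (2, 9, and 39). The Python sketch below illustrates that assumption only; the function name `case_level` and the choice of right-closed intervals are hypothetical.

```python
from bisect import bisect_left

# Assumed cut points: Q1, median, and Q3 of TC from the summary table.
CUTS = [2, 9, 39]


def case_level(total_cases: int) -> int:
    """Map a total case count to a level from 1 (lowest) to 4 (highest)."""
    # bisect_left places a value equal to a cut point in the lower level,
    # so the intervals are [0, 2], (2, 9], (9, 39], and (39, max].
    return bisect_left(CUTS, total_cases) + 1


# Example counts taken from the structure listings and the summary table.
for tc in (0, 2, 9, 23, 87, 110465):
    print(tc, case_level(tc))
```

Because the cut points are quartiles, each level would hold roughly a quarter of the rows, although ties at a cut point can make the groups uneven.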

    3.2.2 Which race makes up the majority of the sample?

    Using the mean values, we draw a pie chart of race proportions, which shows the overall share of each race. In the following sections, we study which races' proportions are related to the number of confirmed cases and the number of deaths.
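As a rough illustration of how pie-chart shares could be derived from the mean values, the Python sketch below rescales the six race and ethnicity means from the "Mean" row of the summary table so that they sum to 100. Whether the original chart rescaled the values in this way is an assumption, and the categories may overlap (Hispanic is reported alongside the race groups).

```python
# Mean percentages copied from the "Mean" row of the summary table.
mean_pct = {
    "Black": 8.8,
    "AIAN": 2.4,
    "Asian": 1.5,
    "NH": 0.1,
    "Hispanic": 9.6,
    "NHW": 76.2,
}

# The means add up to about 98.6, not 100, so rescale them for a pie chart.
total = sum(mean_pct.values())
shares = {race: 100 * pct / total for race, pct in mean_pct.items()}

# The group with the largest share is the majority candidate.
largest = max(shares, key=shares.get)

for race, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{race:<9}{share:6.1f}%")
print("largest:", largest)
```

Note that these are unweighted means of county-level percentages, so they describe the average county rather than the national population.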

    3.3 Stay at home policy in each province

    3 Descriptive Statistics

    3.1 Patient Demographics (Impact of Race)

    3.2 Underlying Health Conditions

    3.3 Impact of Temperature

    4 Chapter 4: Independent Variables EDA: Boxplots, Scatterplots, ANOVA, & Chi-Square

    [1]      0      2      9     39 110465
    
        Shapiro-Wilk normality test
    
    data:  df2$TC
    W = 0.05, p-value <0.0000000000000002
    
        Bartlett test of homogeneity of variances
    
    data:  TC by rank
    Bartlett's K-squared = 25341, df = 3, p-value <0.0000000000000002

    The Shapiro-Wilk test checks whether the data follow a normal distribution.

    H0: The sample data are not significantly different from a normal distribution.
    H1: The sample data are significantly different from a normal distribution.

    The p-value is less than 0.05, so we reject the null hypothesis: total cases are not normally distributed.

    Bartlett's test checks the homogeneity of variances across groups.

    H0: The variances of the groups are not significantly different.
    H1: The variances of the groups are significantly different.

    The p-value is less than 0.05, so we reject the null hypothesis: total cases do not satisfy homogeneity of variance across the groups.
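To make the Bartlett statistic less of a black box, the Python sketch below computes K-squared from its textbook formula: the log of the pooled variance is compared with the logs of the per-group variances, and the result is divided by a small-sample correction. The toy groups are hypothetical and are not the report's data; the reported result (df = 3, consistent with four groups) came from R.

```python
from math import log
from statistics import variance


def bartlett_k_squared(groups):
    """Bartlett's K-squared for a list of samples, each with at least 2 values."""
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    variances = [variance(g) for g in groups]  # sample variances (n - 1 denominator)

    # Pooled variance, weighting each group by its degrees of freedom.
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)

    numerator = (n_total - k) * log(pooled) - sum(
        (n - 1) * log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)) / (3 * (k - 1))
    return numerator / correction


# Hypothetical toy groups, NOT the report's data.
similar = [[1, 2, 3, 4, 5], [11, 12, 13, 14, 15], [21, 22, 23, 24, 25]]
different = [[1, 2, 3, 4, 5], [10, 20, 30, 40, 50], [100, 200, 300, 400, 500]]

k2_similar = bartlett_k_squared(similar)
k2_different = bartlett_k_squared(different)
print(k2_similar)
print(k2_different)
```

Under H0 the statistic approximately follows a chi-square distribution with k - 1 degrees of freedom. With three toy groups (df = 2) the 5% critical value is about 5.991, so the second set of groups would lead to rejecting H0 while the first would not.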

    4.5 SMART Question:

    3.4 Social Determinants of Health???

    3.5 Environmental Variables??

    4 Normality Test of Numerical Columns